# Phase 3: Base Analysis

> Builds the master JSON structure by merging all fetched data

## Overview

Phase 3 is the **critical synthesis stage** where all data from Phase 1 and Phase 2 is merged into a single unified JSON structure. This phase runs a single script that produces the base `all_stocks_fundamental_analysis.json` file.

<Warning>
  If `bulk_market_analyzer.py` fails, the pipeline stops. Phase 4 scripts cannot proceed without the base JSON file.
</Warning>

## Execution Order

Phase 3 runs **one critical script**:

<Steps>
  <Step title="Build Master JSON">
    **Script:** `bulk_market_analyzer.py`

    Merges fundamental data, technical indicators, and listing dates into a unified structure.
  </Step>
</Steps>

***

## Script: bulk\_market\_analyzer.py

### Purpose

Merges data from multiple sources to create the base JSON with 86 fields per stock across 2,775 stocks.

### Input Files

<CodeGroup>
  ```plaintext Required Files theme={null}
  fundamental_data.json              (Phase 1)
  master_isin_map.json               (Phase 1)
  dhan_data_response.json            (Phase 1)
  advanced_indicator_data.json       (Phase 2)
  nse_equity_list.csv                (Phase 1)
  ```

  ```python Loading Logic theme={null}
  import csv
  import json
  import os

  BASE_DIR = os.path.dirname(os.path.abspath(__file__))

  # Load fundamental data
  with open(os.path.join(BASE_DIR, "fundamental_data.json"), "r") as f:
      data = json.load(f)

  # Load technical data
  with open(os.path.join(BASE_DIR, "dhan_data_response.json"), "r") as f:
      dhan_data = json.load(f)
      dhan_tech_map = {item["Sym"]: item for item in dhan_data}

  # Load advanced indicators
  with open(os.path.join(BASE_DIR, "advanced_indicator_data.json"), "r") as f:
      adv_data = json.load(f)
      adv_tech_map = {item["Symbol"]: item for item in adv_data}

  # Load listing dates
  with open(os.path.join(BASE_DIR, "nse_equity_list.csv"), "r") as f:
      reader = csv.DictReader(f)
      listing_date_map = {row["SYMBOL"]: row[" DATE OF LISTING"] for row in reader}
  ```
</CodeGroup>
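The `" DATE OF LISTING"` key in the loader carries a leading space, which suggests the CSV header has padding after its commas. The following is a minimal sketch of one way to make that lookup less fragile by stripping header names before building the map. The sample header layout and the `load_listing_dates` helper are assumptions for illustration and are not taken from `bulk_market_analyzer.py`.

```python
import csv
import io

# Hypothetical sample, modeled on the " DATE OF LISTING" key used in the loader
SAMPLE_CSV = (
    "SYMBOL,NAME OF COMPANY, SERIES, DATE OF LISTING\n"
    "RELIANCE,Reliance Industries Ltd.,EQ,29-NOV-1977\n"
)

def load_listing_dates(handle):
    """Return {symbol: listing date}, tolerating padded header names."""
    reader = csv.DictReader(handle)
    # Reading .fieldnames consumes the header row; reassign it with padding stripped
    reader.fieldnames = [name.strip() for name in reader.fieldnames]
    return {row["SYMBOL"]: row["DATE OF LISTING"].strip() for row in reader}

listing_date_map = load_listing_dates(io.StringIO(SAMPLE_CSV))
print(listing_date_map)
```

With stripped headers, the lookup no longer depends on the exact whitespace in the source file.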

### Data Merging Process

The script iterates through all stocks and merges data sections:

```python theme={null}
analyzed_data = []  # one unified record per stock

for item in data:
    symbol = item.get("Symbol", "UNKNOWN")
    tech = dhan_tech_map.get(symbol, {})
    adv_tech = adv_tech_map.get(symbol, {})
    
    # Extract quarterly data
    cq = item.get("incomeStat_cq", {})
    cy = item.get("incomeStat_cy", {})
    ttm_cy = item.get("TTM_cy", {})
    cv = item.get("CV", {})
    roce_roe = item.get("roce_roe", {})
    
    # Build unified record
    analyzed_data.append({
        "Symbol": symbol,
        "Name": tech.get("DispSym"),
        "Market Cap(Cr.)": tech.get("Mcap"),
        "P/E": cv.get("PE"),
        "ROE(%)": roce_roe.get("ROE"),
        # ... 81 more fields
    })
```
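The merge is a symbol-keyed join in which a missing technical row falls back to an empty dict, so the affected fields become `None` instead of raising an error. A minimal, self-contained sketch of that behavior follows; it uses made-up records and only the five fields shown above, and the `merge_records` wrapper is a hypothetical name.

```python
def merge_records(fundamentals, dhan_rows):
    """Join fundamental records with technical rows on the stock symbol."""
    tech_map = {row["Sym"]: row for row in dhan_rows}
    merged = []
    for item in fundamentals:
        symbol = item.get("Symbol", "UNKNOWN")
        tech = tech_map.get(symbol, {})  # empty dict when no technical row exists
        cv = item.get("CV", {})
        roce_roe = item.get("roce_roe", {})
        merged.append({
            "Symbol": symbol,
            "Name": tech.get("DispSym"),
            "Market Cap(Cr.)": tech.get("Mcap"),
            "P/E": cv.get("PE"),
            "ROE(%)": roce_roe.get("ROE"),
        })
    return merged

# Made-up inputs: "BBB" has fundamentals but no technical row
fundamentals = [
    {"Symbol": "AAA", "CV": {"PE": 28.5}, "roce_roe": {"ROE": 8.2}},
    {"Symbol": "BBB", "CV": {}, "roce_roe": {}},
]
dhan_rows = [{"Sym": "AAA", "DispSym": "AAA Ltd.", "Mcap": 1000}]

merged = merge_records(fundamentals, dhan_rows)
for record in merged:
    print(record)
```

The record for `BBB` is still emitted, with `None` in every field that had no source data.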

### Output Files

| File                                   | Description          | Size    | Records |
| -------------------------------------- | -------------------- | ------- | ------- |
| `all_stocks_fundamental_analysis.json` | **Base master JSON** | \~45 MB | 2,775   |

### Output Structure

<CodeGroup>
  ```json Sample Record theme={null}
  {
    "Symbol": "RELIANCE",
    "Name": "Reliance Industries Ltd.",
    "Listing Date": "29-NOV-1977",
    "Basic Industry": "Petroleum Products",
    "Sector": "Oil, Gas & Consumable Fuels",
    "Index": "NIFTY 50, NIFTY ENERGY",
    
    "Market Cap(Cr.)": "1825000",
    "Stock Price(₹)": "2468.75",
    "P/E": "28.5",
    "ROE(%)": "8.2",
    "ROCE(%)": "9.1",
    "D/E": "0.52",
    
    "Latest Quarter": "Dec-25",
    "Net Profit Latest Quarter": "17594",
    "Net Profit Previous Quarter": "16446",
    "EPS Latest Quarter": "26.10",
    "EPS Previous Quarter": "24.40",
    "Sales Latest Quarter": "245000",
    "OPM Latest Quarter(%)": "12.5",
    
    "QoQ Net Profit Change(%)": "7.0",
    "YoY Net Profit Change(%)": "43.3",
    
    "RSI (14)": "62.5",
    "SMA Status": "SMA 20: Above (4.9%) | SMA 50: Above (24.1%)",
    "EMA Status": "EMA 20: Above (6.3%) | EMA 200: Above (72.6%)",
    "Technical Sentiment": "RSI: Neutral | MACD: Bearish",
    "Pivot Point": "245.50",
    
    "1 Day Returns(%)": "1.2",
    "1 Week Returns(%)": "3.5",
    "1 Month Returns(%)": "8.2",
    "3 Month Returns(%)": "15.6",
    "1 Year Returns(%)": "45.3",
    "% from 52W High": "-5.2",
    "% from 52W Low": "72.8",
    
    "FII % change QoQ": "0.5",
    "DII % change QoQ": "-0.3",
    "Free Float(%)": "50.4"
  }
  ```

  ```plaintext Field Categories (86 Total) theme={null}
  1. Identity & Classification (6 fields)
     - Symbol, Name, Listing Date, Basic Industry, Sector, Index

  2. Fundamentals - Quarterly (40 fields)
     - Net Profit (5 quarters)
     - EPS (5 quarters)
     - Sales (5 quarters)
     - OPM (5 quarters)
     - QoQ/YoY changes

  3. Valuation Ratios (12 fields)
     - Market Cap, Stock Price, P/E, Forward P/E, Historical P/E 5
     - PEG, ROE, ROCE, D/E, OPM TTM, EPS Last Year, EPS 2 Years Back

  4. Ownership (3 fields)
     - FII % change QoQ, DII % change QoQ, Free Float(%)

  5. Technical Indicators (5 fields)
     - RSI (14), SMA Status, EMA Status, Technical Sentiment, Pivot Point

  6. Price Performance (9 fields)
     - 1 Day/Week/Month/3M/6M/1Y Returns
     - % from 52W High/Low
     - Gap Up %, Day Range(%)

  7. Placeholder Fields (Phase 4 will populate)
     - RVOL, ADR, ATH → Populated by advanced_metrics_processor.py
     - Event Markers, Announcements, News Feed → Populated by add_corporate_events.py
     - F&O Flag, Lot Size, Next Expiry → Populated by enrich_fno_data.py
  ```
</CodeGroup>

***

## Data Processing Steps

### Step 1: Load All Data Sources

```python theme={null}
print("Loading fundamental data...")
with open(input_file, "r") as f:
    data = json.load(f)

print(f"Loaded technical data for {len(dhan_tech_map)} symbols.")
print(f"Loaded advanced indicators for {len(adv_tech_map)} symbols.")
print(f"Loaded listing dates for {len(listing_date_map)} symbols.")
```

### Step 2: Extract Quarterly Fundamentals

```python theme={null}
# Parse pipe-separated quarterly values
net_profit_latest = get_value_from_pipe_string(cq.get("Net_Profit"), 0)
net_profit_prev = get_value_from_pipe_string(cq.get("Net_Profit"), 1)
net_profit_2q = get_value_from_pipe_string(cq.get("Net_Profit"), 2)
net_profit_3q = get_value_from_pipe_string(cq.get("Net_Profit"), 3)
net_profit_last_yr = get_value_from_pipe_string(cq.get("Net_Profit"), 4)

# Calculate QoQ and YoY changes
qoq_change = calculate_change(net_profit_latest, net_profit_prev)
yoy_change = calculate_change(net_profit_latest, net_profit_last_yr)
```
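The index positions above map to fixed quarters: 0 is the latest quarter and 4 is the same quarter one year earlier. As a rough illustration, the sketch below splits one series into labelled values and derives the YoY change from them. `QUARTER_LABELS` and `split_quarters` are hypothetical names, and falling back to `0.0` for missing or non-numeric entries is an assumption.

```python
QUARTER_LABELS = ["latest", "previous", "2q_back", "3q_back", "last_year"]

def split_quarters(pipe_string):
    """Map the five pipe-separated values to labelled floats (0.0 when missing)."""
    parts = pipe_string.split("|") if pipe_string else []
    values = {}
    for i, label in enumerate(QUARTER_LABELS):
        try:
            values[label] = float(parts[i])
        except (IndexError, ValueError):
            values[label] = 0.0  # series too short or entry not numeric
    return values

net_profit = split_quarters("17594|16446|15138|17955|12273")
yoy = (net_profit["latest"] - net_profit["last_year"]) / abs(net_profit["last_year"]) * 100
print(net_profit["latest"], net_profit["last_year"], round(yoy, 2))
```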

### Step 3: Merge Technical Indicators

```python theme={null}
# From dhan_data_response.json
rsi = tech.get("DayRSI14CurrentCandle", 0)
sma_50_distance = tech.get("DaySMA50CurrentCandle", 0)
sma_200_distance = tech.get("DaySMA200CurrentCandle", 0)

# From advanced_indicator_data.json
pivot_point = adv_tech.get("PivotPoint")
ema_status = adv_tech.get("EMA_Status")
sma_status = adv_tech.get("SMA_Status")
technical_sentiment = adv_tech.get("Technical_Sentiment")
```

### Step 4: Build Unified Record

```python theme={null}
analyzed_data.append({
    # Identity
    "Symbol": symbol,
    "Name": tech.get("DispSym"),
    "Listing Date": listing_date_map.get(symbol, "N/A"),
    
    # Fundamentals
    "Net Profit Latest Quarter": net_profit_latest,
    "QoQ Net Profit Change(%)": qoq_change,
    "YoY Net Profit Change(%)": yoy_change,
    
    # Technical
    "RSI (14)": rsi,
    "SMA Status": sma_status,
    "Pivot Point": pivot_point,
    
    # Placeholders for Phase 4
    "RVOL": 0,
    "5 Days MA ADR(%)": 0,
    "% from ATH": 0,
    "Event Markers": [],
    "Recent Announcements": [],
    "News Feed": []
})
```

### Step 5: Save Output

```python theme={null}
with open(output_file, "w") as f:
    json.dump(analyzed_data, f, indent=4)

print(f"✅ Analysis complete. Saved {len(analyzed_data)} stocks to {output_file}")
```

***

## Field Calculation Examples

### Quarterly Changes (QoQ, YoY)

```python theme={null}
def calculate_change(current, previous):
    if previous == 0:
        return 0.0
    return ((current - previous) / abs(previous)) * 100

# Example: Net Profit QoQ
# Latest: 17594, Previous: 16446
qoq = calculate_change(17594, 16446)  # ((17594 - 16446) / 16446) * 100 ≈ 7.0
```
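Dividing by `abs(previous)` instead of `previous` keeps the sign of the result aligned with the direction of the move when the base quarter is a loss. The worked example below uses made-up figures to show the difference. The function body repeats the definition above so the block runs on its own.

```python
def calculate_change(current, previous):
    if previous == 0:
        return 0.0
    return ((current - previous) / abs(previous)) * 100

# Loss of 100 turning into a profit of 50: an improvement, so the change is positive
loss_to_profit = calculate_change(50, -100)

# Loss of 100 widening to a loss of 150: a deterioration, so the change is negative
deeper_loss = calculate_change(-150, -100)

# Without abs(), the turnaround would be reported with the wrong sign
naive = (50 - (-100)) / -100 * 100

print(loss_to_profit, deeper_loss, naive)
```

A zero base quarter returns `0.0` instead of raising `ZeroDivisionError`, so a 0% change in the output can also mean that no comparison was possible.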

### Pipe String Parsing

```python theme={null}
def get_value_from_pipe_string(pipe_string, index):
    # Input: "17594|16446|15138|17955|12273"
    # Index 0 → 17594 (Latest Quarter)
    # Index 1 → 16446 (Previous Quarter)
    # Index 4 → 12273 (Last Year Same Quarter)
    
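    # cq.get(...) returns None when the key is absent, so guard before splitting
    if not pipe_string:
        return 0.0
    parts = pipe_string.split('|')
    if index < len(parts):
        try:
            return float(parts[index])
        except ValueError:
            return 0.0  # non-numeric entry
    return 0.0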
```

### SMA/EMA Status Formatting

```python theme={null}
# From advanced_indicator_data.json:
adv_tech = {
    "SMA_20_Distance": 4.9,
    "SMA_50_Distance": 24.1,
    "EMA_20_Distance": 6.3,
    "EMA_200_Distance": 72.6,
}

# Formatted output:
# "SMA Status": "SMA 20: Above (4.9%) | SMA 50: Above (24.1%)"
# "EMA Status": "EMA 20: Above (6.3%) | EMA 200: Above (72.6%)"
```
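One way to produce strings in this shape is sketched below. The `format_ma_status` helper is hypothetical, and the handling of negative distances (labelled "Below" with the absolute value) is an assumption, since the source only shows "Above" cases.

```python
def format_ma_status(prefix, distances):
    """Build a status string from {period: signed % distance from the average}.

    Assumption: a negative distance is reported as 'Below' with its absolute value.
    """
    parts = []
    for period, distance in distances.items():
        side = "Above" if distance >= 0 else "Below"
        parts.append(f"{prefix} {period}: {side} ({abs(distance):.1f}%)")
    return " | ".join(parts)

sma_status = format_ma_status("SMA", {20: 4.9, 50: 24.1})
ema_status = format_ma_status("EMA", {20: 6.3, 200: 72.6})
print(sma_status)
print(ema_status)
```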

***

## Dependencies

### Required from Phase 1

* `fundamental_data.json` (CRITICAL)
* `master_isin_map.json` (for iteration)
* `dhan_data_response.json` (for technical data)
* `nse_equity_list.csv` (for listing dates)

### Required from Phase 2

* `advanced_indicator_data.json` (for SMA/EMA/Pivot)

### Optional (Soft Dependencies)

* If a non-critical file is missing, the script continues, but the fields sourced from that file will be empty, null, or 0

***

## Typical Execution Time

<Note>
  **\~30-60 seconds** — Pure in-memory data merging (no API calls)
</Note>

### Performance Breakdown

* Load all JSONs: \~5s
* Iterate 2,775 stocks: \~20s
* Write output JSON: \~10s

***

## Error Handling

### Critical Failure Detection

```python theme={null}
results["bulk_market_analyzer.py"] = run_script("bulk_market_analyzer.py", "Phase 3")

if not results["bulk_market_analyzer.py"]:
    print("🛑 CRITICAL: bulk_market_analyzer.py failed.")
    print("   Cannot produce all_stocks_fundamental_analysis.json.")
    return  # Pipeline stops
```
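The source does not show how `run_script` is implemented. As a rough sketch, assuming it runs the script in a subprocess and reports success through the exit code, the stop-on-failure logic could look like this. Both function bodies and the `run_phase_3` name are hypothetical.

```python
import subprocess
import sys

def run_script(script_path, phase_label):
    """Hypothetical stand-in: run a script with the current interpreter.

    Returns True when the script exits with code 0.
    """
    print(f"[{phase_label}] Running {script_path}")
    completed = subprocess.run([sys.executable, script_path])
    return completed.returncode == 0

def run_phase_3(script_path="bulk_market_analyzer.py"):
    """Return False so the caller can stop the pipeline when the base JSON was not built."""
    if not run_script(script_path, "Phase 3"):
        print(f"CRITICAL: {script_path} failed; Phase 4 will not run.")
        return False
    return True
```

Returning a boolean keeps the decision to stop in one place, so the caller can skip Phase 4 without inspecting files on disk.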

### Non-Critical Missing Files

```python theme={null}
try:
    with open(ADVANCED_FILE, "r") as f:
        adv_data = json.load(f)
    adv_tech_map = {item["Symbol"]: item for item in adv_data}
except FileNotFoundError:
    print(f"Warning: {ADVANCED_FILE} not found. Running without advanced indicators.")
    adv_tech_map = {}  # Empty map, fields will be null
```
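The same fallback pattern can be factored into a small helper so each optional input is loaded the same way. This is a sketch only: `load_json_or_default` and `build_symbol_map` are hypothetical names, not functions from the script.

```python
import json

def load_json_or_default(path, default):
    """Load a JSON file, returning `default` when the file does not exist."""
    try:
        with open(path, "r") as f:
            return json.load(f)
    except FileNotFoundError:
        print(f"Warning: {path} not found. Continuing with defaults.")
        return default

def build_symbol_map(records, key):
    """Index a list of dicts by `key`, skipping records that lack it."""
    return {item[key]: item for item in records if key in item}

# A missing file yields an empty map, so later .get() lookups return None
adv_data = load_json_or_default("does_not_exist.json", [])
adv_tech_map = build_symbol_map(adv_data, "Symbol")
print(adv_tech_map.get("RELIANCE", {}).get("PivotPoint"))
```

Only `FileNotFoundError` is caught here, so a file that exists but contains invalid JSON still raises and fails loudly.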

***

## Output Validation

### File Size Check

```bash theme={null}
ls -lh all_stocks_fundamental_analysis.json
# Expected: ~45 MB (2,775 stocks × 86 fields)
```

### Record Count Check

```python theme={null}
import json

with open("all_stocks_fundamental_analysis.json", "r") as f:
    data = json.load(f)

print(f"Total stocks: {len(data)}")  # Expected: 2775
print(f"Fields per stock: {len(data[0].keys())}")  # Expected: 86
```

### Field Completeness Check

```python theme={null}
required_fields = [
    "Symbol", "Name", "Market Cap(Cr.)", "P/E", "ROE(%)",
    "Net Profit Latest Quarter", "EPS Latest Quarter",
    "RSI (14)", "SMA Status", "1 Year Returns(%)"
]

for stock in data:
    missing = [f for f in required_fields if f not in stock]
    if missing:
        print(f"{stock.get('Symbol', 'UNKNOWN')}: Missing fields {missing}")
```

***

## What Phase 3 Does NOT Include

The base JSON does **not** yet carry real values for the following fields; they are populated in Phase 4:

<Warning>
  These fields are **placeholders** (0 or empty arrays) in Phase 3:

  * **Advanced Metrics:** RVOL, ADR, ATH, Turnover, 200D EMA Volume
  * **Earnings Performance:** Returns since Earnings, Max Returns since Earnings
  * **F\&O Data:** F\&O Flag, Lot Size, Next Expiry
  * **Event Markers:** Surveillance, Insider Trading, Block Deals, Corporate Actions
  * **Recent Announcements:** Top 5 regulatory filings with PDF links
  * **News Feed:** Top 5 media news items with sentiment
</Warning>

***

## Phase 3 Output Summary

### Files Produced

```plaintext theme={null}
📦 Phase 3 Output:
└─ all_stocks_fundamental_analysis.json   (~45 MB, 2,775 records, 86 fields each)
```

### Field Coverage

* **Complete:** 60 fields (fundamentals, technicals, ratios, returns)
* **Placeholders:** 26 fields (to be populated in Phase 4)
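
To see which placeholders are still unpopulated in a given record, a check along these lines may help. It is a sketch under assumptions: `PLACEHOLDER_DEFAULTS` covers only the six placeholder fields shown in the Step 4 record, not the full set, and `unpopulated_fields` is a hypothetical helper.

```python
# Defaults copied from the Step 4 record; the full placeholder list is longer
PLACEHOLDER_DEFAULTS = {
    "RVOL": 0,
    "5 Days MA ADR(%)": 0,
    "% from ATH": 0,
    "Event Markers": [],
    "Recent Announcements": [],
    "News Feed": [],
}

def unpopulated_fields(record):
    """List placeholder fields that are absent or still hold their Phase 3 default."""
    return [
        name for name, default in PLACEHOLDER_DEFAULTS.items()
        if record.get(name, default) == default
    ]

phase3_record = {"Symbol": "AAA", **PLACEHOLDER_DEFAULTS}
phase4_record = {**phase3_record, "RVOL": 1.8, "Event Markers": ["Block Deal"]}

print(unpopulated_fields(phase3_record))
print(unpopulated_fields(phase4_record))
```

A field whose real value happens to be 0 is indistinguishable from an untouched placeholder, so treat the result as a hint and not as proof that Phase 4 skipped the stock.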

***

## Next Phase

Once Phase 3 completes, the pipeline proceeds to:

<Card title="Phase 4: Enrichment Injection" icon="sparkles" href="/pipeline/phase-4-injection">
  Modifies the master JSON **in-place** to inject advanced metrics, F\&O data, earnings performance, and event markers. **Order matters!**
</Card>
